representation system
Understanding Multimodal Hallucination with Parameter-Free Representation Alignment
Wang, Yueqian, Liang, Jianxin, Wang, Yuxuan, Zhang, Huishuai, Zhao, Dongyan
Wangxuan Institute of Computer Technology, Peking University; Beijing Institute for General Artificial Intelligence; National Key Laboratory of General Artificial Intelligence. wangyuxuan1@bigai.ai
Hallucination is a common issue in Multimodal Large Language Models (MLLMs), yet the underlying principles remain poorly understood. In this paper, we investigate which components of MLLMs contribute to object hallucinations. To analyze image representations while excluding the influence of all factors other than the image representation itself, we propose a parameter-free representation alignment metric (Pfram) that can measure the similarities between any two representation systems without requiring additional training parameters. Notably, Pfram can also assess the alignment of a neural representation system with the human representation system, represented by ground-truth annotations of images. By evaluating the alignment with object annotations, we demonstrate that this metric shows strong and consistent correlations with object hallucination across a wide range of state-of-the-art MLLMs, spanning various model architectures and sizes. Furthermore, using this metric, we explore other key issues related to image representations in MLLMs, such as the role of different modules, the impact of textual instructions, and potential improvements including the use of alternative visual encoders. Multimodal Large Language Models (MLLMs) have advanced rapidly in recent years (Dai et al., 2023; Liu et al., 2023c;b; Zhang et al., 2023; Dong et al., 2024; Bai et al., 2023).
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
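To make the idea of comparing two representation systems without trainable parameters concrete, here is a minimal sketch. It is not the Pfram definition from the paper, which the abstract does not spell out; it assumes a simple stand-in, the average overlap of k-nearest-neighbor sets computed separately in each system. The function names (`cosine`, `knn_indices`, `knn_alignment`) and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_indices(reps, i, k):
    """Indices of the k items most similar to item i, excluding i itself."""
    others = [j for j in range(len(reps)) if j != i]
    others.sort(key=lambda j: cosine(reps[i], reps[j]), reverse=True)
    return set(others[:k])


def knn_alignment(reps_a, reps_b, k):
    """Mean overlap of k-nearest-neighbor sets across two systems, in [0, 1].

    Illustrative stand-in for a parameter-free alignment score: nothing is
    trained, only neighbor rankings within each system are compared.
    """
    if len(reps_a) != len(reps_b):
        raise ValueError("both systems must describe the same items")
    n = len(reps_a)
    total = 0.0
    for i in range(n):
        shared = knn_indices(reps_a, i, k) & knn_indices(reps_b, i, k)
        total += len(shared) / k
    return total / n


# Toy data (assumed): 4 images, model features vs. multi-hot object annotations.
model_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
annotations_matching = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0]]
annotations_mismatched = [[1, 0], [0, 1], [1, 0], [0, 1]]

print(knn_alignment(model_features, annotations_matching, k=1))
print(knn_alignment(model_features, annotations_mismatched, k=1))
```

Because only neighbor rankings are compared, the two systems may have different dimensionalities, which is what lets a neural feature space be compared against annotation vectors.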
The Music Note Ontology
Poltronieri, Andrea, Gangemi, Aldo
In this paper we propose the Music Note Ontology, an ontology for modelling music notes and their realisation. The ontology addresses the relation between a note represented in a symbolic representation system, and its realisation, i.e. a musical performance. This work therefore aims to solve the modelling and representation issues that arise when analysing the relationships between abstract symbolic features and the corresponding physical features of an audio signal. The ontology is composed of three different Ontology Design Patterns (ODP), which model the structure of the score (Score Part Pattern), the note in the symbolic notation (Music Note Pattern) and its realisation (Musical Object Pattern).
- Europe > Austria > Vienna (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.05)
- (3 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
The HaMSE Ontology: Using Semantic Technologies to support Music Representation Interoperability and Musicological Analysis
Poltronieri, Andrea, Gangemi, Aldo
The use of Semantic Technologies, in particular the Semantic Web, has proven to be a valuable tool for describing the cultural heritage domain and artistic practices. However, the panorama of ontologies for musicological applications seems to be limited and restricted to specific applications. In this research, we propose HaMSE, an ontology capable of describing musical features that can assist musicological research. More specifically, HaMSE aims to address issues that have affected musicological research for decades: the representation of music and the relationship between quantitative and qualitative data. To do this, HaMSE allows alignment between different music representation systems and describes a set of musicological features that enable music analysis at different levels of granularity.
- Europe > Austria > Vienna (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- (5 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Optimal Approximation with Sparse Neural Networks and Applications
We use deep sparsely connected neural networks to measure the complexity of a function class in $L^2(\mathbb R^d)$ by restricting the connectivity and the memory required to store the networks. We also introduce representation systems, countable collections of functions, to guide neural networks, since approximation theory with representation systems is well developed in mathematics. We then prove the fundamental bound theorem, which implies that a quantity intrinsic to the function class itself gives information about the approximation ability of both neural networks and representation systems. We also provide a method for transferring existing theories about approximation by representation systems to neural networks, greatly amplifying the practical value of neural networks. Finally, we use neural networks to approximate B-spline functions, which are used to generate B-spline curves. We then analyse the complexity of a class called $\beta$ cartoon-like functions using rate-distortion theory and a wedgelet construction.
- North America > United States > New York (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Research Report (0.50)
- Workflow (0.46)
Notation system allows scientists to communicate polymers more easily
Having a compact yet robust structure-based identifier or representation system for molecular structures is a key enabling factor for efficient sharing and dissemination of results within the research community. Such systems also lay the essential foundations for machine learning and other data-driven research. While substantial advances have been made for small molecules, the polymer community has struggled to come up with an efficient representation system. For small molecules, the basic premise is that each distinct chemical species corresponds to a well-defined chemical structure. This does not hold for polymers.
A semi-holographic hyperdimensional representation system for hardware-friendly cognitive computing
Serb, A., Kobyzev, I., Wang, J., Prodromakis, T.
One of the main, long-term objectives of artificial intelligence is the creation of thinking machines. To that end, substantial effort has been placed into designing cognitive systems; i.e. systems that can manipulate semantic-level information. A substantial part of that effort is oriented towards designing the mathematical machinery underlying cognition in a way that is very efficiently implementable in hardware. In this work we propose a 'semi-holographic' representation system that can be implemented in hardware using only multiplexing and addition operations, thus avoiding the need for expensive multiplication. The resulting architecture can be readily constructed by recycling standard microprocessor elements and is capable of performing two key mathematical operations frequently used in cognition, superposition and binding, within a budget of below 6 pJ for 64-bit operands. Our proposed 'cognitive processing unit' (CoPU) is intended as just one (albeit crucial) part of much larger cognitive systems where artificial neural networks of all kinds and associative memories work in concord to give rise to intelligence.
- North America > Canada (0.04)
- Europe > United Kingdom > England > Hampshire > Southampton (0.04)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- (4 more...)
- Education (0.68)
- Health & Medicine (0.46)
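For readers unfamiliar with the two operations named in the abstract above, superposition and binding, the following sketch shows them in a generic binary hyperdimensional scheme. This is not the semi-holographic system of the paper, whose details the abstract does not give; it assumes the common textbook variant with XOR binding and bitwise-majority superposition, and all function names are hypothetical.

```python
import random


def random_hypervector(dim, rng):
    """Random binary hypervector of the given dimension."""
    return [rng.randint(0, 1) for _ in range(dim)]


def bind(a, b):
    """Bind two hypervectors with elementwise XOR (self-inverse)."""
    return [x ^ y for x, y in zip(a, b)]


def superpose(vectors):
    """Superpose hypervectors by bitwise majority vote.

    Use an odd number of vectors so that no bit position ends in a tie.
    """
    threshold = len(vectors) / 2
    return [1 if sum(bits) > threshold else 0 for bits in zip(*vectors)]


def similarity(a, b):
    """Fraction of positions where the two hypervectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
a, b, c = (random_hypervector(2048, rng) for _ in range(3))

bound = bind(a, b)
recovered = bind(bound, b)      # XOR with b again undoes the binding
bundle = superpose([a, b, c])   # stays similar to each of its components

print(recovered == a)
print(round(similarity(bundle, a), 2), round(similarity(a, b), 2))
```

In this variant a bound vector looks unrelated to its inputs, while a superposed vector remains measurably similar to each component, which is the property that lets such systems store and query sets of items.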
Deep Neural Network Approximation Theory
Grohs, Philipp, Perekrestenko, Dmytro, Elbrächter, Dennis, Bölcskei, Helmut
Deep neural networks have become state-of-the-art technology for a wide range of practical machine learning tasks such as image classification, handwritten digit recognition, speech recognition, or game intelligence. This paper develops the fundamental limits of learning in deep neural networks by characterizing what is possible if no constraints on the learning algorithm and the amount of training data are imposed. Concretely, we consider information-theoretically optimal approximation through deep neural networks, with the guiding theme being a relation between the complexity of the function (class) to be approximated and the complexity of the approximating network in terms of connectivity and memory requirements for storing the network topology and the associated quantized weights. The theory we develop educes remarkable universality properties of deep networks. Specifically, deep networks are optimal approximants for vastly different function classes such as affine systems and Gabor systems. This universality is afforded by a concurrent invariance property of deep networks to time-shifts, scalings, and frequency-shifts. In addition, deep networks provide exponential approximation accuracy (i.e., the approximation error decays exponentially in the number of non-zero weights in the network) for vastly different functions such as the squaring operation, multiplication, polynomials, sinusoidal functions, general smooth functions, and even one-dimensional oscillatory textures and fractal functions such as the Weierstrass function, neither of which has any previously known method achieving exponential approximation accuracy. In summary, deep neural networks provide information-theoretically optimal approximation of a very wide range of functions and function classes used in mathematical signal processing.
- Europe > Austria > Vienna (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
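The claim of exponential accuracy for the squaring operation can be illustrated with a well-known sawtooth construction for ReLU networks. The sketch below is offered as an illustration and is not claimed to be the exact network used in the paper. It composes a three-ReLU "hat" function with itself and uses the identity that x - sum_{s=1..m} g^(s)(x) / 4^s is the piecewise-linear interpolant of x^2 on 2^m equal intervals of [0, 1], whose maximum error is 2^(-2m-2). The function names are hypothetical.

```python
def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def hat(x):
    """Hat function on [0, 1] built from three ReLUs.

    Equals 2x on [0, 1/2] and 2(1 - x) on [1/2, 1].
    """
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def approx_square(x, depth):
    """Approximate x**2 on [0, 1] with `depth` composed hat layers."""
    result = x
    sawtooth = x
    for s in range(1, depth + 1):
        sawtooth = hat(sawtooth)        # s-fold composition of the hat function
        result -= sawtooth / 4.0 ** s
    return result


def max_error(depth, grid=1000):
    """Largest absolute error of approx_square on a uniform grid over [0, 1]."""
    points = (i / grid for i in range(grid + 1))
    return max(abs(approx_square(x, depth) - x * x) for x in points)


for m in range(1, 7):
    print(m, max_error(m), 2.0 ** (-2 * m - 2))
```

Each extra layer adds a constant number of weights but divides the error bound by four, so the error decays exponentially in the number of non-zero weights.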
New Decimal Systems - Great Sandbox for Data Scientists and Mathematicians
We illustrate pattern recognition techniques applied to an interesting mathematical problem: the representation of a number in non-conventional systems, generalizing the familiar base-2 or base-10 systems. The emphasis is on data science rather than mathematical theory, and the style is that of a tutorial, requiring minimal knowledge of mathematics or statistics. However, some off-the-beaten-path, state-of-the-art number theory research is discussed here, in a way that is accessible to college students after a first course in statistics. This article is also peppered with mathematical and statistical oddities, for instance the fact that there are units of information smaller than the bit. You will also learn how the discovery process works, as I have included research that I thought would lead me to interesting results, but did not. In scientific research, usually only final, successful results are presented, while most of the research actually leads to dead ends and is not made available to the reader.
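As a small illustration of a number system that generalizes base 2 or base 10, the sketch below computes greedy digit expansions in an arbitrary real base greater than 1. The choice of non-integer bases is an assumption made here for illustration; the summary above does not say which systems the article covers. The names `greedy_digits` and `from_digits` are hypothetical.

```python
def greedy_digits(x, base, n_digits):
    """First n_digits of the greedy expansion of x in [0, 1) in a real base > 1.

    Each step multiplies the remainder by the base and peels off the integer
    part, exactly as in ordinary binary or decimal expansion.
    """
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    if base <= 1.0:
        raise ValueError("base must be greater than 1")
    digits = []
    for _ in range(n_digits):
        x *= base
        digit = int(x)  # floor, since x is non-negative
        digits.append(digit)
        x -= digit
    return digits


def from_digits(digits, base):
    """Value represented by the digits: sum of d_k * base**-(k + 1)."""
    return sum(d * base ** -(k + 1) for k, d in enumerate(digits))


# Ordinary binary expansion: 0.625 = 0.1010 in base 2.
print(greedy_digits(0.625, 2, 4))

# Golden-ratio base: digits are 0 or 1, and the greedy rule never
# produces two consecutive 1s.
phi = (1 + 5 ** 0.5) / 2
phi_digits = greedy_digits(0.3, phi, 30)
print(phi_digits)
print(from_digits(phi_digits, phi))
```

In the golden-ratio base each digit is still 0 or 1, but a digit position carries less than one bit of information because adjacent 1s are ruled out.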
Multilingual Topic Models
Krstovski, Kriste, Kurtz, Michael J., Smith, David A., Accomazzi, Alberto
Scientific publications have evolved several features for mitigating vocabulary mismatch when indexing, retrieving, and computing similarity between articles. These mitigation strategies range from simply focusing on high-value article sections, such as titles and abstracts, to assigning keywords, often from controlled vocabularies, either manually or through automatic annotation. Various document representation schemes possess different cost-benefit tradeoffs. In this paper, we propose to model different representations of the same article as translations of each other, all generated from a common latent representation in a multilingual topic model. We start with a methodological overview on latent variable models for parallel document representations that could be used across many information science tasks. We then show how solving the inference problem of mapping diverse representations into a shared topic space allows us to evaluate representations based on how topically similar they are to the original article. In addition, our proposed approach provides means to discover where different concept vocabularies require improvement.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)